1.  Introduction

Digital mammography has certain advantages over conventional screen-film mammography [1,2].
These include the ability to manipulate images by changing the look-up table, zooming and
scrolling, image processing, and rapidly sending the images to remotely located experts for
consultation.

There  are  two  ways  to  generate  digital  mammograms:  a)  secondary  digitization,  in  which 
conventional  mammography  films  are  digitized;  and  b)  direct  acquisition  of  primary  digital  images. 
Currently, high-resolution (50 µm) film digitizers are commercially available for secondary
digitization  and  the  image  quality  satisfies  most  diagnostic  requirements  in  mammography. 
However,  primary  full-field  digital  mammography  units  are  not  yet  commercially  available. 

In secondary digitization, it has been shown that a 50-100 µm sampling distance is required in
order to achieve adequate resolution for digital mammography [3,4]. At these high resolutions, the
digitization time is long and the images created are large, normally 10-40 MBytes/image.
Both  the  storage  and  transmission  cost  of  such  large  images  will  increasingly  become  a  problem  as 
more  digital  mammograms  are  created. 

Image compression reduces data storage requirements while preserving useful
information [5,6,7]. It provides a way to handle such large digital images efficiently. Most
existing compression methods were developed for general engineering purposes. They do not consider
the image characteristics and the stringent image quality requirements of medicine. The purpose of
this research is to develop lossy and lossless compression techniques for digital mammograms.

In our previous work, we investigated both lossless and lossy compression methods [8]. For
lossless compression, we developed a structure-lossless compression method for mammograms.
The algorithm exploits the characteristic shape and image properties of mammograms.
The combination of segmentation and predictive coding enables us to achieve high
compression ratios without losing any useful information. We also developed a lossy
compression method using a wavelet transform. The wavelet filter and the quantization
strategy were not optimized in that preliminary study.

The research of the past year consists of two parts. In the first part of the study, we
combined the lossless compression algorithm with the digitization process to
increase the overall throughput of the digitization and compression operations. Conventionally,
digitization  and  compression  of  mammograms  are  done  separately.  Compression  is  applied  to  a 
full  digital  image  after  it  has  been  acquired  through  digitization.  In  this  research,  the  proposed 
lossless  compression  process  is  implemented  in  the  acquisition  computer  connected  to  the  scanner, 
running  concurrently  with  the  digitization  process.  The  implementation  of  the  compression 
algorithm  is  such  that  there  is  no  extra  time  required  for  compression.  In  the  second  part  of  the 
study,  we  investigated  the  optimal  wavelet  filter  bank  and  the  optimal  quantization  strategy  for  the 
wavelet  compression. 

2.  Body 


2.1  Method 

In this section, we first describe the on-line structure-lossless compression method and
then describe the selection of the optimal parameters for the wavelet compression.

2.1.1  On-line  Compression  and  Digitization  Processes 

The  on-line  lossless  compression  method  was  derived  from  the  previous  work.  It  consists  of 
two  steps.  The  first  step  segments  the  breast  image  from  its  background.  The  pixels  beyond  the 
boundary  of  the  breast  are  discarded.  The  second  step  compresses  the  remaining  portion  of  the 
image  using  a  predictive  lossless  compression  technique.  Both  segmentation  and  lossless 
compression  can  be  applied  to  one  line  of  the  image  data  at  a  time. 

The segmentation process scans one line of the image at a time to detect the boundary of the
breast image. Information from the previous lines of the image is used to determine the
current boundary point. The next step of the
structure-lossless compression uses predictive coding [9] to compress the segmented breast image.
The prediction starts from the boundary of the breast and moves toward the chest wall. The
current pixel is predicted from the previous pixel in the same line. Since neighboring pixels are
correlated, the difference between adjacent pixels is generally small.
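
The two steps can be illustrated with a minimal Python sketch for a single scan line. It is not the implementation used in this work. The threshold-based boundary test, the assumption that the background lies to the left of the breast and has lower pixel values, and all function names are illustrative assumptions. The method described above also uses information from previous lines, which this sketch omits.

```python
def find_boundary(line, threshold):
    """Return the index of the first pixel at or above threshold (assumed breast edge)."""
    for i, value in enumerate(line):
        if value >= threshold:
            return i
    return len(line)  # no breast tissue found on this line


def encode_line(line, threshold):
    """Return (boundary index, first breast pixel, differences) for one scan line."""
    start = find_boundary(line, threshold)
    segment = line[start:]
    if not segment:
        return start, None, []
    # Each pixel is predicted by the previous pixel in the same line.
    diffs = [segment[i] - segment[i - 1] for i in range(1, len(segment))]
    return start, segment[0], diffs


def decode_line(start, first, diffs, background=0):
    """Rebuild the line; discarded background pixels are filled with a constant."""
    line = [background] * start
    if first is not None:
        line.append(first)
        for d in diffs:
            line.append(line[-1] + d)
    return line
```

The breast portion is recovered exactly, while the discarded background is replaced by a constant value.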

Huffman coding [10] with a predetermined table is used to code the resulting pixel
differences. The Huffman table is derived from a set of randomly selected mammograms. If a
difference value is not in the table, it is recorded separately.
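
The following Python sketch shows one way such a predetermined table could be built and used. The escape symbol, the 16-bit raw encoding of values missing from the table, and the function names are assumptions made for illustration; the report does not state how out-of-table values are recorded.

```python
import heapq
from collections import Counter
from itertools import count

ESCAPE = "ESC"  # hypothetical marker for differences missing from the table


def build_huffman_table(training_diffs):
    """Build a prefix-code table from training differences, plus an escape symbol."""
    freq = Counter(training_diffs)
    freq[ESCAPE] = 1  # reserve a code for values not seen in training
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(n, next(tie), {sym: ""}) for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n0, _, codes0 = heapq.heappop(heap)
        n1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (n0 + n1, next(tie), merged))
    return heap[0][2]


def encode(diffs, table, raw_bits=16):
    """Code each difference; unknown values get the escape code plus a raw value."""
    mask = (1 << raw_bits) - 1
    out = []
    for d in diffs:
        if d in table:
            out.append(table[d])
        else:
            # Two's-complement representation of the raw difference.
            out.append(table[ESCAPE] + format(d & mask, "0{}b".format(raw_bits)))
    return "".join(out)


def decode(bits, table, raw_bits=16):
    """Invert encode() by walking the bit string one prefix code at a time."""
    inverse = {code: sym for sym, code in table.items()}
    out, code, i = [], "", 0
    while i < len(bits):
        code += bits[i]
        i += 1
        if code in inverse:
            sym = inverse[code]
            code = ""
            if sym == ESCAPE:
                raw = int(bits[i:i + raw_bits], 2)
                i += raw_bits
                if raw >= 1 << (raw_bits - 1):
                    raw -= 1 << raw_bits
                sym = raw
            out.append(sym)
    return out
```

Because the table is fixed in advance, no per-image statistics pass is needed, which is what allows coding to proceed line by line.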

A film digitizer is controlled via an acquisition computer. The digitizer scans (digitizes) a
film one line at a time from left to right, advancing line by line from the top to the bottom of
the film. The data generated is first stored in an internal buffer in the digitizer. When the buffer is
full, the data is transferred to the acquisition computer, which has a larger memory buffer that
holds the entire image. This scanning and transferring process continues, several lines at a time,
until the whole film is scanned.

During the conventional digitization process, the sole responsibility of the acquisition computer
is to read the data from the digitizer's memory buffer into its own memory buffer. The
acquisition computer's CPU is therefore mostly idle when no data is being transferred. The time to
digitize a film is determined by the digitizer scanning speed and the film size. The scanning speed is
normally 50-100 lines/second, depending on the sampling distance.

We utilize the computer's idle CPU time to compress the digitized image during digitization.
The compression is implemented as a second process running concurrently with the digitization
process in the acquisition computer (Figure 1). Both processes share the same image buffer. A
semaphore is used to control access to the shared image memory by the two processes.

First, the scanning process reads several lines of data from the scanner and writes them into the
acquisition memory buffer whenever data is ready in the scanner's temporary buffer. After
each write, the scanning process increases the semaphore by one.

Figure 1. Data flow of the concurrent scanning and compression processes (blocks: Scanning Process, Compression Process)

Meanwhile, the compression process checks the semaphore. If the semaphore indicates new
data in the acquisition computer memory, the compression process reads these lines of data
from the image buffer, finds the boundary of the breast, and compresses the breast image portion
with the predictive coding method described above. The compression process then
decreases the semaphore by one. If there is no new data in the memory, the semaphore blocks
the compression process until new data arrives. These processes
continue until the entire film has been scanned and the image has been compressed.
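
The semaphore handshake can be sketched in Python as follows. This is an illustration under stated assumptions, not the prototype's code: threads stand in for the two processes, a Python list stands in for the shared image buffer, and `run_pipeline` and `compress_chunk` are hypothetical names. The sketch also decrements the semaphore before compressing a chunk rather than after, which is the usual counting-semaphore idiom.

```python
import threading


def run_pipeline(scanned_chunks, compress_chunk):
    """Scan and compress concurrently; return the compressed chunks in order."""
    image_buffer = []                   # shared image buffer
    new_data = threading.Semaphore(0)   # counts chunks written but not yet compressed
    compressed = []

    def scanning():
        for chunk in scanned_chunks:
            image_buffer.append(chunk)  # write several lines into the buffer
            new_data.release()          # increase the semaphore by one

    def compression():
        for index in range(len(scanned_chunks)):
            new_data.acquire()          # block until new data exists, then decrease by one
            compressed.append(compress_chunk(image_buffer[index]))

    threads = [threading.Thread(target=scanning),
               threading.Thread(target=compression)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return compressed
```

For example, `run_pipeline([[1, 2], [3, 4], [5]], sum)` treats each inner list as a group of scanned lines and uses `sum` as a placeholder for the real compressor.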

If compression is faster than scanning, compression finishes as soon as
scanning finishes. The scanning rate is normally 40 to 100 lines/sec. We have implemented this
on-line compression algorithm on a film digitizer. In this prototype system, the compression rate
achieved is faster than the scanning rate.
2.1.2  The  Optimal  Wavelet  and  Quantization  Parameters


A diagram of the wavelet compression scheme [11,12] is shown in Figure 2. A two-dimensional (2D)
wavelet transform is first applied to the image data, resulting in a multiresolution representation of
the image. The wavelet coefficients are then quantized using scalar quantization. Finally, run-length
and Huffman coding are used to code the quantized data.


Figure  2.  Wavelet  compression  scheme 
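
The first three stages of this scheme can be sketched in plain Python. The sketch is a simplification under stated assumptions: it uses a single-level, unnormalized Haar transform rather than the filter banks and four-level decomposition studied here, it omits the final Huffman stage, and all function names are illustrative.

```python
def haar_1d(signal):
    """One level of an unnormalized Haar transform: averages, then differences."""
    half = len(signal) // 2
    avg = [(signal[2 * i] + signal[2 * i + 1]) / 2 for i in range(half)]
    diff = [(signal[2 * i] - signal[2 * i + 1]) / 2 for i in range(half)]
    return avg + diff


def haar_2d(image):
    """Apply haar_1d to every row, then to every column."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def quantize(coeffs, step):
    """Uniform scalar quantization: map each coefficient to an integer index."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


def run_length(values):
    """Collapse a flat sequence into [value, run length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs
```

On smooth image regions most detail coefficients quantize to zero, so the run-length stage collapses them into a few long runs before entropy coding.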


The choice of wavelet filter bank is very important for high-performance image compression. We
evaluated the compression performance of a group of wavelet filter banks and determined the best
filter bank for mammograms. A total of 26 filter banks were selected, and their performance in terms
of compression ratio versus peak signal-to-noise ratio (PSNR) was compared. The optimal wavelet
filter bank was selected according to this compression performance.
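
For reference, PSNR is conventionally computed from the mean squared error between the original and the reconstructed image. The sketch below uses that standard definition. The function name is illustrative, and `max_value` is left as a required argument because the bit depth of the digitized images is not stated here.

```python
import math


def psnr(original, reconstructed, max_value):
    """PSNR in dB between two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)
```

A filter bank performs better when it reaches a higher compression ratio at the same PSNR, or a higher PSNR at the same compression ratio.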

A wavelet transform separates an image into different frequency subbands. The previous study
showed that four levels of wavelet decomposition yield the optimal compression results. The
quantization step size was varied across the subbands in order to determine the optimal
strategy for uniform quantization. An adaptive quantization based on the standard deviation
of each subband was compared with the optimal uniform quantization.
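
The two step-size strategies can be written down as a short Python sketch. The parameterization is an assumption made for illustration: step sizes at adjacent decomposition levels differ by a constant ratio, and the adaptive step is a scale factor times the subband standard deviation. The function names, the level ordering, and the `scale` parameter do not come from the study.

```python
import statistics


def constant_ratio_steps(base_step, ratio, levels=4):
    """Step size per decomposition level; adjacent levels differ by `ratio`."""
    return [base_step * ratio ** level for level in range(levels)]


def adaptive_steps(subbands, scale):
    """Step size per subband, proportional to that subband's standard deviation."""
    return [scale * statistics.pstdev(band) for band in subbands]
```

A ratio of 1 gives the same step size at every level, which is the plain uniform quantizer.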

2.2  Results

2.2.1  On-line  Compression 

We first compared the times required to digitize and to compress an image in two situations: (1)
where the digitization and compression processes were done separately, and (2) where digitization
and compression were combined. A mammogram was digitized at sampling distances ranging
from 50 µm to 200 µm in 50 µm increments. The time required for each process was
measured.

Table  1  shows  the  results.  The  second  and  third  rows  are  the  times  required  to  digitize  films 
and  to  compress  images,  respectively,  when  they  are  done  separately.  The  fourth  row  shows  the 
time required when digitization and compression are combined. Comparing the second and
fourth rows shows that no extra time is required to compress an image when the compression is done
on-line.


Table 1. Scanning and compression times at different sampling distances

Sampling distance (µm)                     200    150    100     50
Scanning (sec)                              15     15     28    103
Compression (sec)                            7     10     22    102
Scanning with on-line compression (sec)     15     15     28    103
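
The time saved by on-line compression can be worked out directly from the values in Table 1. The short Python sketch below does this arithmetic. The dictionary layout and the function name are illustrative, and "time saved" is taken here to mean the reduction relative to scanning followed by a separate compression pass.

```python
# Times in seconds from Table 1, keyed by sampling distance in micrometres.
scanning = {200: 15, 150: 15, 100: 28, 50: 103}
compression = {200: 7, 150: 10, 100: 22, 50: 102}
online = {200: 15, 150: 15, 100: 28, 50: 103}


def time_saved_percent(distance):
    """Percentage reduction in total time when compression runs on-line."""
    separate = scanning[distance] + compression[distance]
    return 100 * (separate - online[distance]) / separate
```

With these values, the saving grows from about 32% at 200 µm to just under 50% at 50 µm, because the separate compression time approaches the scanning time at the finest sampling distance.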

We also compared the compression ratios for 46 randomly selected mammograms. The
compression ratio is defined as the full image size (digitized as a full-size film) divided by the
compressed image size. The compression ratios for the 46 mammograms ranged from 3.2:1
to 8.9:1, with an average ratio of 5.65:1 and a standard deviation of 1.46.

The distribution of compression ratios for the 46 mammograms is shown in Figure 3. About 70%
of the images have compression ratios between 4:1 and 7:1. About 20% of the images have
compression ratios larger than 7:1. Only 10% of the images have compression ratios between 3:1
and 4:1.

We compared the compression ratios achieved at the different sampling distances. Five
mammograms were digitized with different sampling distances. The compression ratios were
averaged at each sampling distance. Table 2 lists the results.

Figure 3. The distribution of the compression ratios

The compression ratio increases slightly as the resolution increases (sampling distance
decreases). As the sampling distance decreases, the correlation between adjacent pixels
becomes higher and the prediction error becomes smaller, so the compression ratio is higher.


Table 2. Compression ratios at different sampling distances

Sampling distance (µm)     200     150     100      50
Image size (MB)            2.2    3.84     8.6    34.6
Compression ratio         7.12    7.24    7.39    7.83
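
Since the compression ratio is the full image size divided by the compressed size, the approximate compressed sizes follow from Table 2 by division. The Python sketch below performs that calculation. It assumes the image sizes in the table are in megabytes, and the names are illustrative.

```python
# (image size, average compression ratio) from Table 2, keyed by sampling distance in micrometres.
table2 = {200: (2.2, 7.12), 150: (3.84, 7.24), 100: (8.6, 7.39), 50: (34.6, 7.83)}


def compressed_size(distance):
    """Approximate compressed size, in the same units as the table's image size."""
    size, ratio = table2[distance]
    return size / ratio
```

For a 50 µm image this gives roughly 4.4 in the table's size units, compared with 34.6 uncompressed.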

2.2.2  Wavelet  Compression 

Among the 26 filter banks evaluated, five gave good compression performance. We
selected the shortest of these five, the 9/7-tap filter bank, and compared it
with the popular Daubechies 4 (D4) and Haar filters. Figure 4 shows the results. At the same
PSNR of 44 dB, the compression ratio using the 9/7 filter bank is about 40% higher than that of
the Haar filter bank.

Figure 4. The performance of the different wavelet filter banks.

The optimal quantization parameters were determined experimentally. We varied the
quantization step sizes so that adjacent levels differed by a constant ratio C, and selected the best
ratio. Adaptive quantization, in which the quantization step size is proportional to the standard
deviation of each subband, was also compared with the constant-ratio quantization. The results are
shown in Figure 5. The optimal quantization is achieved when the quantization step size is the same
among all subbands, i.e., C = 1.


Figure 5. Comparison of different quantization parameters (axis label: PSNR)
2.3  Discussions


The on-line compression work was not in the original proposal. However, we felt the
work is important for the development of teleradiology and other applications. The digitization
time for a mammogram is relatively long. The on-line compression runs the digitization and
compression processes in parallel. The image is compressed while it is being digitized, which cuts
the combined digitization and compression time by up to 50%.

We completed the development of mammogram compression using a wavelet transform, as
stated in the original proposal. However, due to the early termination of the fellowship, the last
part, evaluating the quality of compressed mammograms, will not be completed.

3.  Conclusions 

We investigated both lossless and lossy compression methods for mammograms in this study.
A structure-lossless compression algorithm segments a breast image from its background and
compresses only the breast portion. The combination of segmentation and predictive coding
enables us to achieve high compression ratios without losing any useful information. This
structure-lossless compression was also modified and implemented concurrently with the digitization
process in an acquisition computer. As a result, a mammogram can be compressed at the
same time as it is digitized, and no extra time is required for compression. This on-line
compression could potentially be used for telemammography and other digital applications to
improve overall digitization and compression performance.

A wavelet compression method was developed for mammograms. A 2D wavelet transform
was first applied to a digital mammogram. Uniform quantization was then applied to the subband
images. Run-length coding followed by Huffman coding was applied to the quantized data. The wavelet
filter bank and the quantization parameters were optimized for mammograms.